Goto

Collaborating Authors

 initial value


Appendix for On Effective Scheduling of Model based Reinforcement Learning

Neural Information Processing Systems

We call c(m) the m-step concentrability of a future-state distribution and call Cρ,µ the discountedaverage concentrability coefficient of the future-state distributions. The class of MDPs that satisfies this concentrability assumption is quite large, which is further discussed in Munos and Szepesvári [18]. If Xi, i = 1,...,N is an i.i.d. And when q = 1, N is used instead of N1. From the definition, one can esasily see that Nq,FX1:N N. Lemma A.2. (Single Iteration Error Bound) Let Vk and Vk+1 be the value functions of iteration kand k+1, and Vmax = rmax/(1 γ).


Invariance . the Initialized

Neural Information Processing Systems

In this paper, we analyze neural networks trained on high-dimensional data that lies on a low dimen-441 sional linear subspace denoted by P. We assume that the dimension of P is d ℓ. Throughout the pa-442 per it will be more convenient to analyze data which lies on the subspace M = span({e1,...,ed ℓ}),443 because then the "off manifold" directions correspond exactly to certain coordinates of the input. In444 this section we show that we can essentially analyze the data as if it is rotated to lie on M, and it445 would imply the same consequences as the original data from P.446 Theorem A.1. Let P Rd be a subspace of dimension d ℓ, and let M = span{e1,...,ed ℓ}.447 Let R be an orthogonal matrix such that R P = M, let X P be a training dataset and let448 XR = {R x: x X}.




Appendix A Derivation of Equation (7) 561

Neural Information Processing Systems

Table 5 shows the positioning of FedL2P against existing literature. This personalized policy can either 1) be fixed, e.g. FedEx which randomly samples per-client hyperparameters from learned categorical distributions. Scenarios where it's expensive to train from scratch for a new group of clients, e.g. This is illustrated in Section 4.4 where we adapt a publicly available pretrained Scenarios where it's important to also maintain a global model with high initial accuracy - Note that our approach also does not critically depend on the global model's performance.



BayesTune: Bayesian Sparse Deep Model Fine-tuning

Neural Information Processing Systems

Deep learning practice is increasingly driven by powerful foundation models (FM), pre-trained at scale and then fine-tuned for specific tasks of interest. A key property of this workflow is the efficacy of performing sparse or parameter-efficient fine-tuning, meaning that by updating only a tiny fraction of the whole FM parameters on a downstream task can lead to surprisingly good performance, often even superior to a full model update. However, it is not clear what is the optimal and principled way to select which parameters to update. Although a growing number of sparse fine-tuning ideas have been proposed, they are mostly not satisfactory, relying on hand-crafted heuristics or heavy approximation. In this paper we propose a novel Bayesian sparse fine-tuning algorithm: we place a (sparse) Laplace prior for each parameter of the FM, with the mean equal to the initial value and the scale parameter having a hyper-prior that encourages small scale.


Quantum-Enhanced Reinforcement Learning for Accelerating Newton-Raphson Convergence with Ising Machines: A Case Study for Power Flow Analysis

arXiv.org Artificial Intelligence

The Newton-Raphson (NR) method is widely used for solving power flow (PF) equations due to its quadratic convergence. However, its performance deteriorates under poor initialization or extreme operating scenarios, e.g., high levels of renewable energy penetration. Traditional NR initialization strategies often fail to address these challenges, resulting in slow convergence or even divergence. We propose the use of reinforcement learning (RL) to optimize the initialization of NR, and introduce a novel quantum-enhanced RL environment update mechanism to mitigate the significant computational cost of evaluating power system states over a combinatorially large action space at each RL timestep by formulating the voltage adjustment task as a quadratic unconstrained binary optimization problem. Specifically, quantum/digital annealers are integrated into the RL environment update to evaluate state transitions using a problem Hamiltonian designed for PF. Results demonstrate significant improvements in convergence speed, a reduction in NR iteration counts, and enhanced robustness under different operating conditions.



Physics Guided Machine Learning Methods for Hydrology

arXiv.org Artificial Intelligence

Streamflow prediction is one of the key challenges in the field of hydrology due to the complex interplay between multiple non-linear physical mechanisms behind streamflow generation. While physics based models are rooted in rich understanding of the physical processes, a significant performance gap still remains which can be potentially addressed by leveraging the recent advances in machine learning. The goal of this work is to incorporate our understanding of hydrological processes and constraints into machine learning algorithms to improve the predictive performance. Traditional ML models for this problem predict streamflow using weather drivers as input. However there are multiple intermediate processes that interact to generate streamflow from weather drivers. The key idea of the approach is to explicitly model these intermediate processes that connect weather drivers to streamflow using a multi-task learning framework. While our proposed approach requires data about intermediate processes during training, only weather drivers will be needed to predict the streamflow during testing phase. We assess the efficacy of the approach on a simulation dataset generated by the SWAT model for a catchment located in the South Branch of the Root River Watershed in southeast Minnesota. While the focus of this paper is on improving the performance given data from a single catchment, methodology presented here is applicable to ML-based approaches that use data from multiple catchments to improve performance of each individual catchment.